Syntactic heads in statistical language modeling
Authors
Abstract
The use of syntactic structure in general and heads of syntactic constituents in particular has recently been shown to be beneficial for statistical language modeling. This paper provides an insightful analysis of this role of syntactic structure. It is shown that the predictive power of syntactic heads is mostly complementary to the predictive power of N-grams: they help in positions where an intervening phrase or clause separates the heads from the word being predicted, making the N-gram a poor predictor. Furthermore, a significant portion of this predictive power comes in the form of a more sophisticated back-off effect via the syntactic categories (nonterminal tags) of the heads. Finally, it is shown that using the categories of the syntactic heads is better than using the categories (part-of-speech tags) of the two preceding words, confirming that it is the syntactic analysis and not just the improved back-off strategy which leads to improvements over N-gram models. Experimental results for perplexity and word error rate are presented on the Switchboard corpus to support this analysis.
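The back-off effect described in the abstract can be illustrated with a toy sketch. This is not the paper's model: the class name `HeadBackoffModel`, the two-head context, and the example words and tags are all hypothetical, and the estimates are plain relative frequencies with no discounting or smoothing. The sketch only shows the idea of falling back from the head words to their syntactic categories when the word-level context has not been seen.

```python
from collections import Counter, defaultdict


class HeadBackoffModel:
    """Toy next-word predictor conditioned on two preceding syntactic heads.

    Hypothetical illustration only: unsmoothed relative frequencies,
    backing off from head words to head categories (nonterminal tags).
    """

    def __init__(self):
        # (head word, head word) -> counts of the following word
        self.word_ctx = defaultdict(Counter)
        # (head tag, head tag) -> counts of the following word
        self.tag_ctx = defaultdict(Counter)

    def observe(self, heads, word):
        """Record one event; heads is a pair of (head_word, category) tuples."""
        words = tuple(h[0] for h in heads)
        tags = tuple(h[1] for h in heads)
        self.word_ctx[words][word] += 1
        self.tag_ctx[tags][word] += 1

    def prob(self, heads, word):
        """Relative frequency of word, using head words if seen, else head tags."""
        words = tuple(h[0] for h in heads)
        tags = tuple(h[1] for h in heads)
        counts = self.word_ctx.get(words)
        if not counts:
            # Word-level context unseen: back off to the categories of the heads.
            counts = self.tag_ctx.get(tags)
        if not counts:
            # Neither context was observed; a real model would smooth here.
            return 0.0
        return counts[word] / sum(counts.values())


model = HeadBackoffModel()
model.observe((("contract", "NP"), ("ended", "VP")), "with")
model.observe((("deal", "NP"), ("ended", "VP")), "with")
model.observe((("deal", "NP"), ("ended", "VP")), "in")

# Seen head words: the estimate comes from the word-level context.
seen = model.prob((("contract", "NP"), ("ended", "VP")), "with")
# Unseen head word "merger": the estimate comes from the (NP, VP) tag context.
backed_off = model.prob((("merger", "NP"), ("ended", "VP")), "with")
```

In this toy setup the unseen head pair still receives a usable estimate because its categories match contexts observed in training, which is the kind of generalization the abstract attributes to the nonterminal tags of the heads.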
Similar resources
Shallow Semantics with Shallow Syntax
Assigning semantic roles to the constituents of a natural language sentence is an important first step in translating natural language into a logical form for further processing. I present a statistical classifier which can perform this task using minimal syntactic cues. I use the syntactic and the semantic head of each constituent as the only features and present simple rules for extracting th...
Statistical Language Modeling with Performance Benchmarks using Various Levels of Syntactic-Semantic Information
Statistical language models using n-gram approach have been under the criticism of neglecting large-span syntactic-semantic information that influences the choice of the next word in a language. One of the approaches that helped recently is the use of latent semantic analysis to capture the semantic fabric of the document and enhance the n-gram model. Similarly there have been some approaches t...
Perception Development of Complex Syntactic Construction in Children with Hearing Impairment
Objectives: Auditory perception or hearing ability is critical for children in acquisition of language and speech hence hearing loss has different effects on individuals’ linguistic perception, and also on their functions. It seems that deaf people suffer from language and speech impairments such as in perception of complex linguistic constructions. This research was aimed to study the pe...
Gender-Based Investigation of the Syntactic Development of Iranian EFL Learners: A Focus on Processability Theory
Pienemann (1998, 2015) put forward Processability Theory to enlighten why language learners follow definite developmental paths. The aim of the present study was to run a comparative investigation into the difficulty order of different grammatical structures for male and female Iranian EFL learners predicted by Processability Theory. 185 Iranian university students took part in this study. They...
Decision Tree-Based Syntactic Language Modeling
Title of dissertation: DECISION TREE-BASED SYNTACTIC LANGUAGE MODELING Denis Filimonov, Doctor of Philosophy, 2011 Dissertation directed by: Dr. Mary Harper Department of Computer Science Dr. Philip Resnik Department of Linguistics Statistical Language Modeling is an integral part of many natural language processing applications, such as Automatic Speech Recognition (ASR) and Machine Translatio...